The coronavirus nucleocapsid (N) protein is a structural protein that forms a ribonucleoprotein complex 
with genomic RNA. In addition to its structural role, it has been described as an RNA-binding protein that 
might be involved in coronavirus RNA synthesis. Here, we report a reverse genetic approach to elucidate the 
role of N in coronavirus replication and transcription. We found that human coronavirus 229E (HCoV-229E) 
vector RNAs that lack the N gene were greatly impaired in their ability to replicate, whereas the transcription 
of subgenomic mRNA from these vectors was easily detectable. In contrast, vector RNAs encoding a functional 
N protein were able to carry out both replication and transcription. Furthermore, modification of the 
transcription signal required for the synthesis of N protein mRNAs in the HCoV-229E genome resulted in the 
selective replication of genomes that are able to express the N protein. This genetic evidence leads us to 
conclude that at least one coronavirus structural protein, the N protein, is involved in coronavirus replication. 


Coronaviruses are enveloped, positive-strand RNA viruses 
that are mainly associated with enteric or respiratory diseases 
in humans, companion animals, and livestock. Coronavirus 
particles contain a genomic RNA of approximately 27,000 to 
30,000 nucleotides and four structural proteins: the spike 
glycoprotein S, the membrane protein M, the small envelope 
protein E, and the nucleocapsid protein N (44). Three of these 
four proteins are embedded in the viral envelope. These are 
the S protein, which mediates binding of the virus particle to 
the target cell and the subsequent fusion of viral and cellular 
membranes (8), the M protein, which has a crucial role in the 
incorporation of the virus nucleocapsid into virus particles (28, 
30, 31), and the E protein, which facilitates virus assembly, 
possibly by inducing curvature into pre-Golgi membranes, the 
site at which coronaviruses assemble by budding (13, 36). The 
fourth structural protein, N, is associated with the viral RNA 
genome to form a ribonucleoprotein complex (38). 

The coronavirus N protein has been described as a 
multifunctional protein displaying RNA-binding activity, 
protein-protein interaction (specifically with the M protein), and the 
ability for self-association (22, 29, 30). Clearly, many of these 
features will reflect the structural role of the N protein in the 
virus particle. However, there are also some observations that 
suggest the N protein may have a role in viral RNA synthesis. 
First, several studies have provided evidence for the binding of 
N protein to coronaviral RNA sequences that are involved in 
the regulation of RNA synthesis. These sequences include the 
coronavirus leader sequence, transcription regulatory 
sequences (TRS), and sequences corresponding to the 3' end of 
coronavirus genomes (2, 10, 11, 24, 34, 48). Second, it has been 
shown that in addition to a cytoplasmic distribution within the 
host cell, at least a fraction of the coronavirus N protein 
colocalizes with replicative proteins at the sites of viral RNA 
synthesis early in infection (12, 45, 55). Third, it has been 
demonstrated that there is a requirement for sustained translation 
of the N protein in trans or in cis for optimal replication of 
bovine coronavirus DI RNA and transmissible gastroenteritis 
virus-derived replicons (1, 9). Fourth, we and others have 
reported that the recovery of recombinant coronaviruses can be 
facilitated under conditions that allow sustained N protein 
expression (7, 59). And fifth, heterologous gene expression 
from coronavirus-based multigene vectors is greatly improved 
by cotransfection of vector RNA with a synthetic mRNA 
encoding the N protein (52). 

Coronavirus RNA synthesis can be divided formally into two 
processes: replication of full-length virus genomes and 
transcription of subgenomic mRNAs. Genome replication is 
mediated through a full-length minus-strand copy of the genome 
that serves as a template for the production of progeny virus 
genomes. Coronavirus transcription is a complex process 
involving the discontinuous synthesis of up to eight minus-strand 
RNAs of subgenomic size which contain sequences 
corresponding to the 5' and 3' ends of the genome and serve as 
templates for the synthesis of a 5'- and 3'-coterminal set of 
subgenomic mRNAs (40, 47). According to the current model 
of coronavirus discontinuous extension, the synthesis of 
nascent minus-strand RNA initiates at the 3' end of the 
plus-strand genomic template and proceeds to TRS elements that 
are located at various sites within the virus genome (39, 41, 61). 
At these sites, a proportion of the nascent minus-strand RNA 
is translocated to the 5' leader sequence of the genome, and 
subsequently a minus-strand copy of the leader is added (42, 
57, 61). Thus, the TRS elements determine the fusion sites of 
leader and "body"-derived sequences (5'- and 3'-end-derived 
sequences, respectively) of the subgenomic RNAs, and the 
number of TRS elements correlates with the number of 
subgenomic mRNAs produced by a particular coronavirus. 

In both replication and transcription, proteins encoded by 
the replicase gene are critically involved. The coronavirus 
replicase proteins are synthesized as large polyproteins that are 
extensively processed by virus-encoded proteinases to produce 
a functional replicase/transcriptase complex (60). It has been 
suggested that this complex comprises so-called scaffolding 
proteins, together with proteins that have enzymatic functions, 
including RNA-dependent RNA polymerase, helicase, and 
uridylate-specific endoribonuclease (NendoU), and putative 
exonuclease and 2'-O-ribose methyltransferase activities (44). In 
addition, the active replicase/transcriptase complex may 
contain other proteins of viral or cellular origin (17, 21, 26, 27, 43, 
44). These proteins might include the N protein. On the other 
hand, Snijder and colleagues have shown that all structural 
proteins of the closely related arterivirus equine arteritis virus, 
including the N protein, are dispensable for genome 
replication and subgenomic RNA synthesis (25); we have shown that 
the viral replicase gene products alone suffice for coronavirus 
transcription (50). 

In this paper, we used a reverse genetic approach to study N 
protein function in the context of viral RNA synthesis. Our 
data suggest a role for the coronavirus N protein in virus 
replication. We show that efficient, sustained replication of 
coronavirus-based vector RNAs only takes place if the N 
protein is expressed by the vector RNA. We also show that the 
level of N protein expression determines the level of vector 
RNA-mediated reporter gene expression, which again is 
consistent with a role for N in vector RNA replication. And finally, 
by constructing and monitoring the replication of recombinant, 
mutated coronavirus genomes, we provide genetic evidence 
that sustained expression of the N protein facilitates 
replication of the genomic RNA. 

MATERIALS AND METHODS 

Viruses and cells. BHK-21 and CV-1 cells were purchased from the European 
Collection of Cell Cultures. D980R cells were a kind gift from G. L. Smith, 
Imperial College, London, United Kingdom. All cells were maintained in 
minimal essential medium supplemented with fetal bovine serum (5 to 10%) and 
antibiotics. Recombinant vaccinia viruses were propagated, titers were 
determined, and the viruses were purified as previously described (49). 

Cloning of plasmid DNAs and recombinant vaccinia viruses. The isolation of 
recombinant vaccinia viruses vHCoV-vec-1, vHCoV-vec-CLG, and vHCoV-vec-GN 
has been described previously (15, 50, 52). The isolation of recombinant 
vaccinia virus vHCoV-SfiI is based on the recombinant vaccinia virus 
vHCoV-inf-1, which contains the full-length cDNA of the human coronavirus 229E 
(HCoV-229E) genome (49). The first step, the generation of recombinant 
vaccinia virus vRec-1, has been described previously (15). The genome of vaccinia 
virus vRec-1 contains a cDNA insert corresponding to HCoV-229E nucleotides 
(nt) 1 to 21145, followed by a 2.1-kbp DNA fragment containing the Escherichia 
coli guanine phosphoribosyltransferase gene (gpt) under the control of a vaccinia 
virus promoter and a cDNA fragment corresponding to the 3' end of the 
HCoV-229E genome, starting at nt 24646. The second step involved homologous 
recombination of vRec-1 DNA with pHCoV-SfiI DNA and the isolation of vaccinia 
virus vHCoV-SfiI using gpt-negative selection in D980R cells. To clone the 
plasmid pHCoV-SfiI, a PCR product corresponding to HCoV-229E nt 20478 to 
24091 was generated using primers SacI-20500up 
(5'-ACGTGAGCTCGGTGCTTAGTCTTGTTAGGAGTGG-3') and StopS-Sndown 
(5'-ACGTGCGGCCGCGGCCATTACGGCCTTACTGTATGTGGATCTTTTCAACG-3') and 
vHCoV-inf-1 DNA as a template. The PCR product was cleaved with SacI and NotI 
and cloned in pBluescriptII KS(+) (Stratagene). The resulting plasmid DNA was 
cleaved with NotI and HindIII, and a NotI-HindIII DNA fragment derived from 
NotI-HindIII-cleaved vaccinia virus vNotI/tk (23) genomic DNA (corresponding 
to the DNA sequence located downstream of the 3' end of the HCoV-229E 
genome in the vaccinia virus vRec-1 genome) was inserted. The identity of 
pHCoV-SfiI was confirmed by sequencing analysis, and the plasmid was used for 
homologous recombination with vRec-1 DNA to generate the recombinant vaccinia 
virus vHCoV-SfiI. Note that the HCoV-229E-derived sequence of vHCoV-SfiI 
encompasses the HCoV-229E nt 1 to 24091 (nt 24089 to 24091 represent the stop 
codon of the HCoV-229E S gene), followed by a SfiI restriction site which is 
unique in the vHCoV-SfiI genome. 

To construct the plasmid pMHV-N, cDNA was amplified by RT-PCR using 
primer Sac-MN-ATG (5'-ACGTAGACTCACCATGTCTTTTGTTCCTGGGCAAGAAAATGC-3'), 
primer Bam-MN-stop (5'-ACGTGGATCCTTACACATTAGAGTCATCTTCTAACC-3'), 
and vMHV-inf-1 DNA (9a) as a template. 
The resulting PCR product, which contains the mouse hepatitis virus (MHV) N 
gene, was cleaved with SacI and BamHI and cloned into pBluescriptII KS(+). 

To construct the plasmid pTRE-HN, the HCoV-229E N gene was amplified by 
PCR using the oligonucleotide primer SacII-HN-ATG 
(5'-ACGTAGAGCTCACCATGGCTACAGTCAAATGGGCTGATGC-3'), primer Bam-HN-Stop 
(5'-ACGTGGATCCTTAGTTTACTTCATCAATTATGTCAG-3'), and vHCoV-inf-1 DNA 
as a template. The PCR product was cleaved with SacII and BamHI 
and ligated into the SacII- and BamHI-cleaved plasmid pTRE DNA. 

Generation of synthetic N protein cDNA templates for in vitro transcription. 
Plasmid pME (49) was used as template to generate a PCR product containing 
the HCoV-229E N gene using primers Nup 
(5'-ACGTAATACGACTCACTATAGGGACGAAACCATGGCTACAGTCAAATGGGCTG-3') and Ndown 
(5'-TTTTTTTTTTTTTTTTTTTTCAAACAACACAGTGGCATGTTTAG-3'). 
The PCR product was used as template for in vitro transcription of an mRNA 
encoding the HCoV-229E N protein. Plasmid pMSN was a kind gift of M. Ackermann 
and K. Tobler, Department of Veterinary Medicine, University of Zurich, 
Zurich, Switzerland, and contains the porcine epidemic diarrhea virus (PEDV) N 
gene located downstream of a bacteriophage T7 RNA polymerase promoter. pMSN 
was linearized with NotI prior to in vitro transcription of a synthetic mRNA 
encoding the PEDV N protein. Plasmid pMHV-N was linearized with XhoI prior to in 
vitro transcription of a synthetic mRNA encoding the MHV N protein. 

Generation of a stable cell line expressing the HCoV-229E N protein. To 
generate a stable cell line expressing the HCoV-229E N protein, BHK-21 cells 
were transfected with 5 µg of plasmid pcEF Tet.On/NEO (37) and a stable cell 
line, designated BHK-Tet/On, which expresses the Tet-activator protein rtTA, 
was selected using G418 (400 to 800 µg/ml). BHK-Tet/On cells were transfected 
with 2.5 µg of plasmid pTRE-HN and 2.5 µg of plasmid pTK-Hyg (Clontech); a 
stable cell line, BHK-HCoV-N, expressing the HCoV-229E N protein in the 
presence of doxycyclin, was selected with hygromycin B (300 µg/ml). N protein 
expression in BHK-HCoV-N cells was analyzed by Western blotting using 
murine monoclonal antibody NG12 (16) as primary antibody and NA 931 (sheep 
anti-mouse immunoglobulin G, peroxidase-linked; Amersham) as secondary 
antibody in combination with ECL Western blotting detection reagents 
(Amersham), according to the manufacturer's instructions. 

Generation of recombinant HCoV-229E cDNA containing randomized 
sequences. A DNA fragment containing the HCoV-229E-derived nt 24097 to 
27277 with a randomized sequence at HCoV-229E nt 25671 to 25673 was 
generated. Two PCRs were carried out with plasmid pME as template and primers 
(i) Oli 17 (5'-TGTGGTGAGTATGTTGCT-3') and EspTRSNr-down 
(5'-ACGTCGTCTCTTCAGNNNAGAAAAAATGAAGCAATCTTTCGTTTTCTGTC-3'; 
N represents any nucleotide) or (ii) EspTRSNr-up 
(5'-ACGTCGTCTCCCTGAACGAAAAGATGGCTACAGTCAAATGGGC-3') and Hend-polyA. 
Both PCR products were cleaved with Esp3I, then ligated in vitro, and used as 
a template for another PCR with primers Sfi4a-up 
(5'-ACGTGGCCGTAATGGCCCTAGGTTTGTTCACATTGCAACTTGTG-3') and Hend-polyA. The 
resulting PCR product contains HCoV-derived nt 24097 to 27277 (with 
randomized nt 25671 to 25673) plus a SfiI cleavage site and a synthetic poly(A) sequence 
encoded upstream and downstream, respectively. After cleavage with SfiI, the 
DNA fragment was ligated in vitro with SfiI-cleaved genomic DNA from purified 
recombinant vaccinia virus vHCoV-SfiI, and the resulting ligation product was 
used as template for bacteriophage T7 RNA polymerase-driven in vitro 
transcription. 
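
The size of the randomized population can be checked with a short calculation. The following is a minimal Python sketch, not part of the experimental procedure: it assumes the TRS-N core sequence UCUAAACU at HCoV-229E nt 25668 to 25675 and the randomized positions nt 25671 to 25673 stated in this paper, and it assumes that all four nucleotides are equally permitted at each randomized position. The function name `randomized_variants` and the constant names are hypothetical and chosen only for illustration.

```python
from itertools import product

# TRS-N core sequence of HCoV-229E (nt 25668 to 25675), as given in the text.
TRS_N_CORE = "UCUAAACU"
CORE_START = 25668
# Randomized positions stated in the text: nt 25671 to 25673.
RANDOM_FIRST, RANDOM_LAST = 25671, 25673


def randomized_variants(core=TRS_N_CORE, core_start=CORE_START,
                        first=RANDOM_FIRST, last=RANDOM_LAST):
    """Enumerate every core sequence obtainable by randomizing nt first..last."""
    lo = first - core_start          # 0-based index of first randomized base
    hi = last - core_start + 1       # 0-based index one past the last one
    prefix, suffix = core[:lo], core[hi:]
    return [prefix + "".join(bases) + suffix
            for bases in product("ACGU", repeat=hi - lo)]


variants = randomized_variants()
print(len(variants))                 # number of theoretical sequence variants
print(TRS_N_CORE in variants)        # the authentic TRS-N is one of them
```

Under these assumptions the enumeration yields 4^3 = 64 sequences of the form UCUNNNCU, one of which is the authentic TRS-N core.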

In vitro transcription and electroporation. In vitro transcription using 
bacteriophage T7 RNA polymerase in the presence of an m7G(5')ppp(5')G cap 
analog was done as described previously (49). Vector RNA (15 µg) or full-length 
recombinant HCoV-229E RNA was electroporated as previously described (50) 
into BHK-HCoV-N or BHK-21 cells with or without 5 µg in vitro-transcribed 
RNA encoding the HCoV, PEDV, or MHV N proteins. A synthetic mRNA 
encoding a truncated E. coli β-galactosidase protein has been described 
previously (54), and 5 µg was used as a negative control for electroporation. 

Reporter protein analysis. Reporter protein expression was analyzed 3 days 
postelectroporation. Green fluorescent protein (GFP) expression was analyzed 
by fluorescence microscopy with a Leica DM IL fluorescence microscope and 
by flow cytometry with a FACSCalibur and CellQuest software (BD 
Biosciences). Sorting of green fluorescent BHK-21 cells that were transfected 
with HCoV-vec-1 RNA and HCoV-229E-N mRNA was done using a 
FACSVantage instrument (BD Biosciences). Firefly luciferase expression was analyzed 
using the luciferase reporter gene assay (Boehringer, Mannheim, Germany), as 
described by the manufacturer. 

RNA preparation, Northern blotting, RT-PCR, and sequencing analysis. 
Poly(A)-containing RNA was isolated from BHK-21 cells using oligo(dT)25 
Dynabeads (Dynal, Oslo, Norway) as previously described (53). Northern blot 
analysis involved electrophoresis and transfer to nylon membranes as previously 
described (51). To detect HCoV-229E-specific RNAs, a 32P-labeled probe 
corresponding to the HCoV-229E nucleotides 26297 to 27273 was produced using 
the Multiprime DNA-labeling system (Amersham). Reverse transcription-PCR 
(RT-PCR) was done with Superscript II reverse transcriptase (Invitrogen) as 
described previously (53). To amplify the region containing the HCoV-229E 
TRS-N region by RT-PCR, in vitro transcripts or poly(A)-containing RNA from 
transfected cells was used as a template for reverse transcription with primer 
25950-down (5'-GCATCTTTATGGGGTCCTGTGCC-3'). The RT products 
were used as templates to amplify the TRS-N region corresponding to the 
HCoV-229E genome and subgenomic N protein mRNAs using (semi)nested 
PCR protocols. To amplify the TRS-N region corresponding to recombinant 
HCoV-229E genomes, primers 25500-up (5'-CATGACAGTTGCCGTGCCGAGCAC-3') 
and 25900-down (5'-TGACAAATCCACCCGTTTGCCCT-3') were 
used for the first PCR; primers 25630-up (5'-CTGCAGTGAGCTCTCCCATGAGCA-3') 
and 25820-down (5'-GTCTTTCTTGTTGATGGGTACC-3') were 
used for the nested PCR. To amplify the leader-body junction of N protein 
mRNAs, the leader-specific primer 25-up (5'-CTTAAGTACCTTATCTATCTACAGATAG-3') 
and primer 25900-down were used for the first PCR, and 
primers 25-up and 25820-down were used for the seminested PCR. The resulting 
PCR products were analyzed directly by sequencing using primer 25820-down or 
cloned into plasmid pGem-T (Promega). The resulting plasmid clones were 
sequenced to determine the sequence of individual genomic or subgenomic 
mRNAs at the TRS-N region. Sequencing analysis of plasmid DNA, RT-PCR 
products, and the recombinant vaccinia virus cDNA inserts was done by standard 
cycle sequencing methods with an ABI 310 Prism Genetic Analyzer. 
Computer-assisted analysis of sequence data was facilitated by the Lasergene bio-computing 
software (DNASTAR). 


RESULTS 

Cotransfection of HCoV-229E vector RNAs with N protein 
mRNAs. The aim of this study was to analyze the coronavirus 
N protein function(s) that might be related to viral RNA 
synthesis by using a reverse genetic approach based on 
HCoV-229E. The recombinant RNA constructs used in this study are 
illustrated in Fig. 1. We previously reported that transfection 
of a HCoV-based vector RNA, designated HCoV-vec-1, into 
BHK-21 cells resulted in the synthesis of a subgenomic mRNA 
that contained a leader-body fusion site characteristic of 
coronavirus transcription (50). These data indicate that the 
replicase gene products suffice for coronavirus transcription 
because the only viral proteins encoded by the HCoV-vec-1 RNA 
are derived from the replicase gene. Similar results have been 
obtained using another vector RNA, designated 
HCoV-vec-CLG (52). We have also reported that after transfection of 
GFP-encoding HCoV-229E-based vector RNAs (HCoV-vec-1 
or HCoV-vec-CLG) only very few transfected cells (<0.1%) 
displayed green fluorescence. However, if a synthetic mRNA 
encoding the HCoV-229E N protein was cotransfected with 
HCoV-vec-CLG RNA, we found that the percentage of green 
fluorescent cells increased up to 3%. When we used a synthetic 
RNA where the AUG start codon of the N protein gene was 
changed to CCG, we again observed <0.1% of green 
fluorescent cells after cotransfection with HCoV-vec-CLG RNA (52). 
This indicates that the N protein rather than the N mRNA 
sequence is responsible for the observed effect. 

FIG. 1. Structure of recombinant HCoV-229E and vector RNAs. 
(A) The structural relationship and sizes of HCoV-229E and vector 
RNAs used in this study are shown. Open reading frames are indicated 
as boxes designated by encoded gene products. L, leader RNA 
sequence; An, poly(A) sequence. (B) The structure of recombinant 
HCoV-229E genomes with a randomized N gene TRS is shown 
(bottom). Also shown are the cDNAs that have been ligated in vitro (at the 
SfiI site) to produce a full-length HCoV-229E cDNA template for in 
vitro transcription. 

Based on these observations, we constructed a vector DNA, 
designated HCoV-vec-GN, that transcribes to produce an 
RNA containing the N protein gene in addition to the 
HCoV-229E replicase gene and the GFP reporter gene (Fig. 1). The 
N gene in HCoV-vec-GN is located downstream of the GFP 
gene, and transcription of an N gene mRNA is driven by the 
authentic TRS-N region, corresponding to the HCoV-229E nt 
25654 to 25685 (including the TRS-N core sequence 
25668-UCUAAACU-25675). As shown in Fig. 2, transfection of 
HCoV-vec-GN RNA into BHK-21 cells resulted in <0.1% green 
fluorescent cells. In contrast, cotransfection of a synthetic 
mRNA encoding the HCoV-229E N protein together with 
HCoV-vec-GN RNA resulted in an increased number of green 
fluorescent cells (3.5%). These cells displayed relatively 
intense fluorescence 16 h posttransfection. This contrasts with 
the result observed when HCoV-vec-1 RNA (which does not 
encode the HCoV-229E N protein) was cotransfected with the 
mRNA encoding the HCoV-229E N protein. Again, an 
elevated number of green fluorescent cells (3.4%) (Fig. 2) were 
seen, but in this case the intensity of fluorescence was lower 
and only became apparent 48 h posttransfection. To show that 
the elevated percentage of green fluorescent cells did not 
result from the cotransfection of any RNA, we cotransfected 
mRNA encoding a truncated E. coli β-galactosidase protein 
with the HCoV-vec-1 RNA. Again, <0.1% of green 
fluorescent cells were detected (Fig. 2). 

FIG. 2. Transfection of vector RNAs. Vector RNA HCoV-vec-GN 
or HCoV-vec-1 (each, 15 µg) was transfected into BHK-21 cells with 
or without synthetic mRNAs (5 µg) encoding the HCoV-229E, PEDV, 
or MHV N proteins or a truncated β-galactosidase protein as 
indicated. Three days posttransfection, the percentage of green fluorescent 
cells was analyzed by flow cytometry. Each column represents the 
mean value of three independent experiments. 

Thus, we conclude that cotransfection of mRNA encoding 
the HCoV-229E N protein with HCoV-229E-based vector 
RNAs encoding GFP increased the number of green 
fluorescent cells. Even if the N protein was encoded by the vector 
RNA itself, the number of GFP-expressing cells remained at a 
low level when the vector RNA was transfected alone. These 
results show that the presence of N protein early after 
transfection of vector RNAs (i.e., N protein that is provided in 
trans) enhances the level of coronavirus-mediated subgenomic 
mRNA transcription. However, they do not allow us to 
conclude whether this is due to increased transcription from 
templates derived from the input vector RNA or from de novo 
synthesized (i.e., replicating) vector RNA. 

Next, we asked whether the N proteins of other 
coronaviruses could provide the same trans-active function(s) as the 
HCoV N protein in the system described above. Thus, we 
produced two further synthetic mRNAs encoding the MHV 
and PEDV N proteins, respectively. As shown in Fig. 2, 
cotransfection of HCoV-vec-1 RNA with mRNAs encoding the 
MHV or PEDV N proteins resulted in <0.1 and 0.9% green 
fluorescent cells, respectively. We conclude that only the 
PEDV N protein mRNA gave rise to an increased number of 
green fluorescent cells, although less effectively than the 
HCoV-229E N protein mRNA. Notably, PEDV is a group I 
coronavirus and closely related to HCoV-229E (6, 20), whereas 
MHV is a group II coronavirus and more distantly related to 
HCoV-229E. 

Detection of HCoV-229E vector-specific RNAs by Northern 
blot analysis. In addition to measuring the levels of GFP 
expression, we also used a 32P-labeled probe, corresponding to 
the 3' end of the HCoV-229E genome, to visualize the 
vector-specific RNAs produced in transfected cells. Despite using 
probes of high specific activity, we were not able to detect any 
RNAs from cells that were transfected with HCoV-vec-1 or 
HCoV-vec-GN vector RNA alone. After cotransfection of 
HCoV-229E N mRNA with HCoV-vec-1 RNA, we could 
detect a faint signal for a subgenomic mRNA encoding GFP 
(data not shown). This contrasted with the cotransfection of 
HCoV-229E N mRNA with HCoV-vec-GN RNA, where we 
could easily detect the vector RNA and two mRNAs encoding 
GFP and N protein (Fig. 3, lane 11). In order to increase the 
sensitivity of RNA analysis from cells that have been 
transfected with HCoV-229E N mRNA and HCoV-vec-1 RNA, we 
sorted the green fluorescent cells prior to the isolation of 
poly(A)-containing RNA. In this case, an RNA that 
corresponds to a subgenomic mRNA encoding GFP was easily 
detectable (Fig. 3, lane 7), but we were unable to detect 
full-length HCoV-vec-1 RNA (Fig. 3, lane 10), even after 
prolonged autoradiography. 

To exclude degradation or the incomplete transfer of 
full-length HCoV-vec-1 RNA during RNA preparation, gel 
electrophoresis, or Northern blotting, we also applied 
poly(A)-containing RNA from HCoV-229E-infected cells (Fig. 3, lane 
1) and 1 ng of in vitro-transcribed HCoV-vec-1 RNA and 
HCoV-vec-CLG to the same gel (Fig. 3, lanes 3 and 4, 
respectively, and the corresponding lanes 8 and 9 after prolonged 
autoradiography). In each case, virus and vector RNAs were 
detectable, irrespective of their size. To exclude the possibility 
that the RNA detected in HCoV-vec-1 and HCoV-229E N 
mRNA-transfected cells corresponded to the transfected N protein mRNA, 
we applied 1 ng of in vitro-transcribed N protein mRNA (i.e., 
used for cotransfection with HCoV-vec-1 RNA) to the same 
gel. This RNA was also easily detectable (Fig. 3, lane 5) and, 
consistent with its smaller size, migrated faster in the gel than 
the subgenomic mRNA detected in HCoV-vec-1-transfected 
cells. 

These results show that vector RNAs encoding a functional 
N gene were able to transcribe and replicate RNA, whereas 
vector RNAs lacking the N gene were able to transcribe RNA 
but replication of full-length vector RNA was not detectable. 
However, these results cannot rigorously rule out the 
possibility that the N gene provides a cis-acting RNA signal that 
enhances replication. To address this question, we 
coelectroporated the HCoV-vec-CLG vector RNA (lacking the N gene) 
with recombinant full-length virus genome and an mRNA 
encoding the N protein into BHK-21 cells. In this case, rapidly 
available N protein was supplied by translation of the 
cotransfected N protein mRNA and sustained N protein expression 
was mediated by the full-length genomic RNA in trans. Again, 
3 days posttransfection we analyzed poly(A)-containing RNA 
from transfected cells by Northern blotting. As shown in Fig. 3, 
lane 12, the full-length HCoV-vec-CLG RNA is readily visible, 
indicating that replication of vector RNA in the presence of 
replicating HCoV-229E virus had occurred. In summary, our 
data indicate that initially, after transfection of vector RNAs, 
the N protein (provided by cotransfected N protein mRNA) 
may be important for the establishment of a 
replicase/transcriptase complex; however, sustained N protein expression 
(e.g., if the N protein was encoded by the vector RNA itself or 
provided in trans by replicating virus) was required for efficient 
replication. 

FIG. 3. Northern blot analysis. Poly(A)-containing RNA was isolated from HCoV-229E-infected MRC-5 cells (lane 1), BHK-21 cells (lane 6), 
BHK-21 cells that have been transfected with HCoV-229E N protein mRNA and HCoV-vec-1 RNA (lanes 7 and 10), BHK-21 cells that have been 
transfected with HCoV-229E N protein mRNA and HCoV-vec-GN RNA (lane 11), or BHK-21 cells that have been transfected with HCoV-229E 
N protein mRNA, HCoV-vec-CLG RNA, and HCoV-229E genomic RNA (lane 12). The RNA was analyzed by Northern blotting as described 
in Materials and Methods. Lane 2, no RNA; lanes 3 and 8, 1 ng of in vitro-transcribed HCoV-vec-1 RNA; lanes 4 and 9, 1 ng of in vitro-transcribed 
HCoV-vec-CLG RNA; lane 5, 1 ng of in vitro-transcribed HCoV-229E N protein mRNA. Lanes 8, 9, and 10 represent lanes 3, 4, and 7, 
respectively, after prolonged autoradiography. Genomic and subgenomic RNAs of HCoV-229E (RNA1 to -7) and HCoV-229E vector RNAs 
(arrows) are indicated. 

The level of HCoV-229E vector-mediated reporter gene 
expression correlates with the level of HCoV-229E N protein 
expression. The experiments described above demonstrate 
that cotransfection of an HCoV-229E N protein mRNA with 
HCoV-229E vector RNAs increases the number of green 
fluorescent cells. Even if the N protein is encoded by the vector 
RNA, cotransfection of an N protein mRNA is required, 
indicating that the N protein must provide a function that 
facilitates the establishment of a functional transcriptase or 
replicase complex early after transfection. To determine whether 
this effect was dose dependent, we produced a stable, 
regulatable BHK-21-derived cell line expressing the HCoV-229E N 
protein. This cell line, designated BHK-HCoV-N, is based on 
the Tet/ON system. The analysis of N protein expression by 
Western blotting revealed that the HCoV-229E N protein can 
be expressed to different levels with different concentrations of 
doxycyclin in the tissue culture medium (Fig. 4A). We have 
used this cell line for the transfection of 15 µg in 
vitro-transcribed HCoV-vec-CLG RNA and analyzed the 
vector-mediated expression of GFP and firefly luciferase 3 days 
posttransfection. The average percentage of green fluorescent cells in 
these experiments was approximately 3%, independent of the 
doxycyclin concentration used in the medium (data not shown). 
This indicates first that cotransfection of an mRNA encoding 
the N protein with the vector RNA is no longer needed in N 
protein-expressing cells, and second, that the level of N protein 
expression did not influence the number of green fluorescent 
cells. However, as shown in Fig. 4B, the level of N protein 
expression did influence the level of reporter protein 
expression. At doxycyclin concentrations of 0.1 and 1 µg/ml, the 
firefly luciferase expression was about 30-fold higher than 
levels observed with HCoV-vec-CLG RNA-transfected cells in 
the absence of doxycyclin. Thus, although the overall number 
of green fluorescent cells did not change, the level of reporter 
protein expression increased at higher levels of N protein 
expression. 

Selection of replication-competent HCoV genomes. Having 
shown that the HCoV-229E N protein provides a function that 
increases the number of green fluorescent cells and the level of 
reporter protein expression after transfection of 
HCoV-229E-based vector RNAs, we asked whether the N protein is 
involved in coronavirus replication or coronavirus transcription. 
As mentioned above, we were unable to detect efficient vector 
RNA replication when the N protein was not encoded by the 
vector RNA (Fig. 3, lanes 7 and 10). However, as soon as 
sustained N protein expression was ensured, full-length vector 
RNAs became detectable by Northern blot analysis (Fig. 3, 
lanes 11 and 12). Accordingly, we reasoned that HCoV-229E 
genomes do not replicate efficiently if they are unable to 
mediate N protein expression. To test this hypothesis, we decided 
to analyze the replication of HCoV-229E genomes that had 
been modified at the TRS of the N gene (TRS-N), a cis-acting 
RNA element which is required for the production of a 
subgenomic mRNA encoding the N protein. 

FIG. 4. Firefly luciferase reporter protein analysis in 
BHK-HCoV-N cells. (A) Western blot analysis of BHK-HCoV-N cells. Each 
lane corresponds to a cytoplasmic lysate derived from 2 × 10^5 
BHK-HCoV-N cells that were cultivated for 2 days in medium without or 
with 0.01, 0.1, or 1 µg/ml doxycyclin (Dox), as indicated. For 
comparison, a Western blot analysis of cytoplasmic lysate derived from 2 × 10^5 
HCoV-229E-infected MRC-5 cells is shown. (B) A total of 15 µg of 
HCoV-vec-CLG vector RNA was transfected into 10^7 BHK-HCoV-N 
cells. After transfection, the cells were split into four wells containing 
medium without or with 0.01, 0.1, or 1 µg/ml Dox. Luciferase activity 
was analyzed 3 days posttransfection. Results are shown for three 
independent experiments. In each experiment, luciferase activity of 
transfected BHK-HCoV-N cells that were cultivated without Dox was 
set as 1 relative unit. Bars indicate 95% confidence intervals. 

The fusion of leader- and body-derived sequences of the
subgenomic HCoV-229E N protein mRNA has been determined
to occur at the TRS-N sequence 25668 UCUAAACU
25675 (the numbers indicate HCoV-229E nucleotide positions)
within the HCoV-229E genome (14). We made use of our
full-length cDNA clone of HCoV-229E and modified the
TRS-N to 25668 UCUNNNCU 25675, where N represents any
nucleotide (Fig. 5A). The stretch of three random nucleotides
within the TRS-N was introduced into a full-length cDNA of
HCoV-229E by PCR and in vitro ligation as described in Materials
and Methods. Following in vitro transcription, a population
of (theoretically) 64 different full-length recombinant
HCoV-229E genomes was produced. These RNA molecules
were then transfected into BHK-21 cells (which are not susceptible
to HCoV-229E infection), and 3 days later poly(A)-containing
RNAs were isolated. We then compared the sequences
of the input RNA genomes with those of the
reisolated RNA genomes at the TRS-N region. To do this,
DNA fragments containing the TRS-N region were amplified
by RT-PCR from the RNA genomes, and a consensus sequence
was determined. As shown in Fig. 5B, the reisolated
RNA genomes have clearly undergone selection, since the
nucleotides at the randomized positions had shifted to a predominance
of adenines. However, there was also a dominant
uridine peak detectable, corresponding to HCoV-229E nt
25673. To determine the sequences of individual RNA genomes,
we cloned the RT-PCR products and sequenced a total
of 44 plasmid DNAs derived from the input RNA population
and 41 plasmid DNAs derived from the reisolated RNA population.
The result of this analysis is summarized in Fig. 5C.
HCoV-229E genomes that contained the authentic TRS-N
sequence were represented at 4.5% in the input RNA population,
and the percentage of these molecules increased to
9.8% in the reisolated RNA population. Similarly, the percentage
of genomes that contained a uridine at position 25673 was
20.5% of the input RNAs and 41.5% of the reisolated RNAs.
Thus, these two groups had obviously undergone positive selection
during amplification in BHK-21 cells. Sequences that
contained one nucleotide difference, compared to the authentic
TRS-N (25668 UCUAAACU 25675) or leader-TRS (62 UCUCAACU 69),
remained approximately at the same level (31.8%
and 29.3% for input and reisolated RNAs, respectively). Other
sequences that did not match the groups mentioned above
had presumably undergone negative selection in BHK-21 cells,
since their percentage dropped from 43.2% in the input RNAs
to 19.5% in the reisolated RNAs.
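
The size of the randomized pool and the four-group classification used in Fig. 5C can be illustrated with a short enumeration. The following Python sketch is an illustration only: the function names are hypothetical, the order in which overlapping groups are resolved (group 1, then group 2, then group 3) is an assumption, and the expected fractions assume that all 64 variants are equally represented in the input population, which need not hold in practice. The reported percentages are copied from the text above.

```python
from collections import Counter
from itertools import product

TRS_N_CORE = "AAA"   # authentic nt 25671 to 25673 within UCUAAACU
LEADER_CORE = "CAA"  # corresponding positions of the leader TRS UCUCAACU


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def enumerate_pool():
    """All possible NNN triplets, i.e. the theoretical 64 genome variants."""
    return ["".join(p) for p in product("ACGU", repeat=3)]


def classify(core):
    """Assign a randomized triplet to groups 1 to 4.

    Assumption: a variant that fits several groups is assigned to the
    lowest-numbered one (e.g. CAU is counted in group 2, not group 3).
    """
    if core == TRS_N_CORE:
        return 1  # wild-type TRS-N
    if hamming(core, TRS_N_CORE) == 1 or hamming(core, LEADER_CORE) == 1:
        return 2  # 1-nt change relative to TRS-N or leader TRS
    if core[2] == "U":
        return 3  # uridine at position 25673 (NNU)
    return 4      # everything else


def expected_fractions():
    """Group fractions expected for an unbiased pool of 64 variants."""
    pool = enumerate_pool()
    counts = Counter(classify(core) for core in pool)
    return {group: counts[group] / len(pool) for group in (1, 2, 3, 4)}


# Percentages reported in the text: group -> (input, reisolated)
REPORTED = {1: (4.5, 9.8), 2: (31.8, 29.3), 3: (20.5, 41.5), 4: (43.2, 19.5)}


def fold_change():
    """Ratio of reisolated to input percentage for each group."""
    return {g: round(after / before, 2) for g, (before, after) in REPORTED.items()}


if __name__ == "__main__":
    for group, fraction in expected_fractions().items():
        print(f"group {group}: expected {fraction:.1%} of an unbiased pool")
    for group, ratio in fold_change().items():
        print(f"group {group}: reisolated/input = {ratio}")
```

Under these assumptions, groups 1 to 4 comprise 1, 15, 14, and 34 of the 64 possible variants, respectively. The reisolated-to-input ratios computed from the reported percentages are roughly 2 for groups 1 and 3, close to 1 for group 2, and below 0.5 for group 4, consistent with the positive and negative selection described above.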

Analysis of subgenomic N protein mRNAs. The finding that
specific HCoV-229E genomes within the transfected population
had undergone positive selection in nonsusceptible
BHK-21 cells indicates that these genomes were replicated
preferentially. If this phenotype is related to the ability to
transcribe N protein mRNA, it should be possible to find N
protein subgenomic mRNAs; furthermore, the sequence at the
randomized TRS-N positions of these mRNAs should correlate
with the sequences of the selected genomes. Therefore, we
performed RT-PCR to specifically amplify N protein mRNAs that
had been produced in the transfected cells. Using a leader
RNA-specific primer and a body RNA-specific primer in the
PCR, we were able to obtain RT-PCR products from reisolated
poly(A)-containing RNAs that corresponded to N protein
mRNAs. These products were cloned, and the sequences
at the leader-body fusion sites were determined (Fig. 6). As
expected, most N protein mRNAs (>60%) contained either
the authentic TRS-N sequence (UCUAAACU) or the sequence
corresponding to the leader TRS (UCUCAACU), confirming
that these TRS elements were efficient in directing the
synthesis of coronavirus subgenomic mRNAs. Also as expected,
the authentic N protein mRNA leader-body fusion site
was used when these sequences were present. Similarly, the
authentic leader-body fusion site was present in three N protein
mRNAs that contained only one nucleotide change compared
to the leader or N gene TRSs (one of these sequences,
UCUCAUCU, contained a uridine at position 25673). Interestingly,
however, we could also determine nine N protein
mRNA sequences that appeared to contain alternative leader-body
fusion sites, although each of these mRNAs was only
found once in our sample. It is noteworthy that eight of these
“unusual” N protein mRNAs contained either only one nucleotide
change, compared to the leader or N gene TRS, or a
uridine at position 25673. Thus, irrespective of which leader-body
fusion site had been used for the production of a particular
N protein mRNA, 13 of 14 TRS-N elements sequenced in
this experiment correlated with HCoV-229E genomes that had
undergone a positive selection or remained at the same level
during their passage in BHK-21 cells.

FIG. 5. Sequencing analysis of recombinant HCoV-229E genomes with a randomized N gene TRS. (A) HCoV-229E nt 25663 to 25680
containing the core sequence of the N gene TRS (boxed) are shown together with the structure and size of the HCoV-229E genome. The
HCoV-229E nt 25671 AAA 25673 within the TRS-N core sequence (underlined) was changed to a random sequence indicated as NNN. (B) An
RT-PCR sequencing analysis of in vitro-transcribed HCoV-229E-based genomes with randomized HCoV-229E nt 25671 to 25673 is shown (input
genomes) (left). These genomes were used for transfection of BHK-21 cells. Three days later, the poly(A)-containing RNA was isolated and
analyzed by RT-PCR sequencing (reisolated genome) (right). Sequencing results are shown for HCoV-229E nt 25663 to 25680, and the region
corresponding to the randomized sequence is underlined. (C) RT-PCR products obtained from input genomes or reisolated genomes were cloned
and sequenced. The analysis comprised 44 individual plasmid clones corresponding to input genomes and 41 individual plasmid clones corresponding
to reisolated genomes. On the basis of the sequence determined at randomized nt 25671 to 25673, the recombinant genomes were placed
in one of four groups: group 1, recombinant genomes with the HCoV-229E wild-type sequence (AAA); group 2, recombinant genomes with a 1-nt
change compared to the TRS-N or leader-TRS sequence; group 3, recombinant genomes containing a U nucleotide at HCoV-229E position 25673
(NNU); group 4, sequences that do not match groups 1 to 3. The percentages of each group in the population of input and reisolated genomes
are indicated.

FIG. 6. Sequence analysis of subgenomic N protein mRNAs. (A) The structure and size of the recombinant HCoV-229E genome with

DISCUSSION 

The overall conclusion of this study is that expression of the
coronavirus N protein facilitates replication of the genomic
RNA. This is true for N protein that is provided in trans during
the earliest stages of infection (which may be equivalent to N
protein brought into the cell as part of the viral nucleocapsid)
or N protein that is expressed from a replicase-transcriptase
complex in the context of a coronavirus-based replicon or an
infectious coronavirus genome. This conclusion is based upon
evidence resulting from transfection experiments using coronavirus-based
vector RNAs and genetic evidence using recombinant,
full-length coronavirus genomes. Most notably, transfection
of coronavirus-based vector RNAs revealed that we can
only detect the amplification of full-length, HCoV-229E-derived
vector RNAs in transfected cells if they encode a functional
N protein gene or if sustained N protein expression is
ensured in trans (e.g., mediated by coreplicating virus). In
contrast, subgenomic mRNA transcripts were readily detectable,
irrespective of the presence or absence of the N gene in
the vector RNA. The genetic evidence is that HCoV-229E
genomes that are able to express N protein have a selective
advantage for replication in nonpermissive cells. This does not
mean that genomes that are unable to express N protein will
not replicate if N protein is provided in trans, but in a transfection
experiment where only a small percentage of cells establish
a functional replicase/transcriptase complex, the selective
advantage conferred by N protein expression will be more
evident. Also, our finding that the PEDV N protein but not the
MHV N protein can at least partially reproduce the effects
seen with the HCoV-229E N protein suggests that the HCoV-229E
replicase complex or RNA genome can interact with N
proteins derived from closely related viruses (e.g., coronaviruses
of the same group) but not with N proteins from more
distantly related viruses.

In addition to the main conclusion, our analysis of the replication
of mutated HCoV-229E genomes led to a number of
observations that may be relevant to the mechanism of coronavirus
transcription. First, it is evident that when the authentic
N gene TRS is rendered nonfunctional, the virus is able to
use alternative leader-body fusion sites. The analysis of alternative
leader-body junctions revealed that a stretch of 8 nucleotides
located upstream of the TRS-N element (25645 AAACGAAA 25653)
could also be found immediately downstream of the
leader TRS (70 AAACGAAA 77), which suggests that base pairing
might have taken place at or near these sequences. The
actual leader-body junction sites were determined at body
RNA-derived nucleotides 25640 AC 25642, 25634 AAC 25636, or
25645 AAAC 25648 (Fig. 6B). Thus, the leader-body junction was
either within a putative base-pairing region (as in natural TRS
elements) or at adjacent upstream sequences containing AC or
AAC nucleotides. In this context, it is interesting that a common
feature of the order Nidovirales is an open reading frame
1b-encoded protein that has NendoU activity with a strong
preference for cleavage at GU(U) sequences in double-stranded
RNA substrates (18, 46). It has been suggested that
GU(U) sequences at the 3' terminus of nascent minus-strand
RNAs, which correspond to conserved AAC nucleotides in the
core of the HCoV-229E TRS element, might be substrates of
this activity; therefore, NendoU activity might be involved in
the transcription of subgenomic mRNAs (18). Our data support
the functional importance of the AAC sequence in
HCoV-229E TRS elements, but further studies are required to
provide a direct link to the activities of enzymes such as the
uridylate-specific endoribonuclease.
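
The correspondence between AAC in the plus-strand TRS and GU(U) in the minus strand follows directly from base complementarity. As a minimal illustration (the helper name `reverse_complement_rna` is hypothetical, and the sketch makes no claim about actual NendoU substrates or cleavage sites), the following Python snippet derives the minus-strand complement of the TRS-N core sequence:

```python
# Watson-Crick complement table for RNA
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Return the complementary RNA strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    trs_n = "UCUAAACU"  # plus-strand TRS-N, nt 25668 to 25675
    minus_strand = reverse_complement_rna(trs_n)
    print(minus_strand)                     # complement of the whole TRS-N
    print(reverse_complement_rna("AAC"))    # complement of the AAC core
    print("GUU" in minus_strand)            # is GUU present in the minus strand?
```

A minus strand copied from the plus-strand template up to and including the AAC triplet therefore carries GUU at its 3' end, and a junction at an AC dinucleotide corresponds to GU.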

The data we have presented provide substantial evidence for
a functional role of a structural protein in coronavirus RNA
synthesis. To our knowledge, this is the first example of a
structural protein that is involved in the RNA synthesis of a
nonsegmented positive-strand RNA virus. Among all positive-strand
RNA viruses, there is only a single group of plant
viruses, namely, alfamoviruses and ilarviruses from the Bromoviridae
family, for which a similar phenomenon has been reported
(5, 19, 56). These viruses contain a tripartite positive-strand
RNA genome with RNA1- and RNA2-encoding
proteins involved in viral RNA replication and RNA3 giving
rise to a subgenomic mRNA4 encoding the coat protein. It has
been shown that a mixture of three genomic RNAs of alfamoviruses
and ilarviruses is not infectious to plants unless the coat
protein or mRNA4 is added to the inoculum. This phenomenon
has been termed genome activation and appears to take
place prior to minus-strand RNA synthesis, most probably
through binding of coat protein to conserved RNA structures
at the 3' end of the genomic RNAs, resulting in an enhancement
of translation (32, 33). In addition, the coat protein is
also involved in plus-strand RNA synthesis (“asymmetric plus-strand
RNA accumulation”); interestingly, the coat protein is
in fact part of the alfamovirus and ilarvirus replication complexes.
Although there are striking similarities to the data
presented here, further studies are required to elucidate the
mechanism(s) of N protein function in coronavirus replication.

In contrast to positive-strand RNA viruses, an involvement
of structural proteins in virus replication and transcription has
been described for a number of negative-strand RNA viruses,
particularly viruses with nonsegmented genomes. Vesicular stomatitis
virus (VSV; order Mononegavirales, family Rhabdoviridae)
is one of the most advanced experimental systems, and a
number of studies related to replication and transcription have
been reported (3, 58). For example, it has recently been proposed
by Banerjee and colleagues (35) that two distinct polymerase
complexes carry out transcription and replication of
VSV genomic RNA. The virus proteins L and P (for large
protein catalytic subunit and the essential phosphoprotein cofactor,
respectively) are part of the transcriptase complex while
the L, P, and N proteins are part of the replicase complex. It
has also been proposed that the VSV N protein may promote
read-through at transcription signals and that VSV replication
may require a significant amount of N protein for the encapsidation
of the nascent strand. Unlike positive-strand RNA
viruses, the template for replication in the case of VSV is a
ribonucleoprotein complex composed of genome-sized RNAs
and the N protein (4). Thus, it appears that in many negative-strand
RNA viruses the processes of replication and transcription
are tightly regulated by a structural protein. Whether
distinct enzyme complexes also distinguish coronavirus replication
and transcription and whether the coronavirus N protein
has a regulatory role to play in these complexes remain to
be determined.

In summary, the present study complements and extends our
current understanding of coronavirus replication and transcription.
We have identified a structural protein as a factor
that facilitates coronavirus genome replication; to the best of
our knowledge, this is unprecedented among nonsegmented
positive-strand RNA viruses. The functional importance of
coronavirus N proteins in fundamental aspects of the coronavirus
life cycle, namely encapsidation and replication of virus
genomes, suggests that the N protein provides an attractive
target for antiviral intervention aimed at combating coronavirus
infections.